Improving Label Noise Filtering by Exploiting Unlabeled Data
Similar Resources
Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining
Predictive data mining typically relies on labeled data without exploiting a much larger amount of available unlabeled data. The goal of this paper is to show that using unlabeled data can be beneficial in a range of important prediction problems and therefore should be an integral part of the learning process. Given an unlabeled dataset representative of the underlying distribution and a K-cla...
Document Filtering Boosted by Unlabeled Data
This paper describes three learning methods for document filtering that use unlabeled data. The proposed methods are based on a committee of classifiers that are trained on a small set of labeled data and then augmented by a large amount of unlabeled data. By taking advantage of unlabeled data, the amount of labeled data needed is significantly reduced and the filtering accuracy ...
Efficient Model Selection for Regularized Classification by Exploiting Unlabeled Data
Hyper-parameter tuning is a resource-intensive task when optimizing classification models. The commonly used k-fold cross-validation can become intractable in large-scale settings when a classifier has to learn billions of parameters. At the same time, in the real world, one often encounters multi-class classification scenarios with only a few labeled examples; model selection approaches often offe...
Learning with Augmented Class by Exploiting Unlabeled Data
In many real-world applications of learning, the environment is open and changes gradually, which requires the learning system to be able to detect and adapt to the changes. Class-incremental learning (C-IL) is an important and practical problem in which data from unseen augmented classes are fed to the system, but it has not been studied well in the past. In C-IL, the system should beware of predict...
Exploiting Unlabeled Data Using Improved Natural Langua
This paper presents an unsupervised method that uses a limited amount of labeled data and a large pool of unlabeled data to improve natural language call routing performance. The method uses multiple classifiers to select a subset of the unlabeled data to augment the limited labeled data. We evaluated four widely used text classification algorithms: Naive Bayes Classification (NBC), Support Vector ma...
Journal
Journal title: IEEE Access
Year: 2018
ISSN: 2169-3536
DOI: 10.1109/access.2018.2807779